Volatility polarization of non-specialized investors' heterogeneous activity 
Financial markets provide an ideal frame for studying decision making in crowded environments.
Both the amount and the accuracy of the data allow us to apply tools and concepts from physics
that are used to study collective and emergent phenomena or self-organised and highly
heterogeneous systems. We analyse the activity of 29 930 non-expert individuals who represent a
small portion of the whole market trading volume. The very heterogeneous activity of individuals
obeys Zipf's law, while synchronization network properties unveil a community structure. We then
correlate individual activity with the most eminent macroscopic signal in financial markets, that
is, volatility, and quantify how individuals are clearly polarized by volatility. The assortativity
by attributes of our synchronization networks also indicates that individuals look at the
volatility rather than directly imitating each other, thus providing an interesting interpretation
of herding phenomena in human activity. The results can also improve agent-based models since they
provide a direct estimation of the agents' parameters.

PACS numbers: 89.65.Gh, 05.45.Tp, 89.75.Fb, 02.50.Le 



Collective behavior in socioeconomic contexts is increasingly studied empirically from a
mathematical and physical point of view, since large amounts of data are now available. This has
led to the emergence of a new data-driven research area. The most intriguing aspects in this area
concern how microscopic interactions trigger macroscopic phenomena and how individuals react to
the resulting macroscopic bath, showing some analogy with magnetism and the Ising model. Nowadays,
human activity leaves a digital trace that can be correlated to global information flows. Both the
amount and the accuracy of the data make financial trading floors ideal scenarios to study the
linkage between collective and individual human activity. Recent research has related aggregated
market trading volume with search queries in Google or Yahoo, considered as a macroscopic field
[5-7], but individual activity data is not easily accessible for research purposes. Just a few
papers have been published with this sort of data [8-11], and none of them focuses on the
relationship between microscopic and macroscopic levels. In this regard, the main purpose of the
paper is to answer how non-expert investors make decisions and whether their actions are a
response to a given macroscopic field, using rather unique data from a Spanish investment firm.
Since the volume traded by these investors represents a very small subset of the whole volume of
the market participants, we can observe the influence or polarization of macro-level signals over
the microscale, but not the other way around. Moreover, the conclusions can be easily extrapolated
to the broader context of human decision making since we deal with non-expert agents.

Every individual decision is materialised in an operation, either buying or selling, and each
investor trades with his/her own money. We here analyse a dataset that contains 3 303 695
individual recordings from 29 930 clients of a particular investment firm. Price, date and
number of shares traded in each transaction were stored on a daily basis. Individuals were trading
between 2000 and 2007 in 120 assets of the Spanish stock market, IBEX. During this period the
market had no general trend. We limit our analysis to the 8 assets most traded by our investors:
Telefonica (TEF), Ezentis (EZE), Sogecable (SGC), Santander (SAN), BBVA, Red Electrica (ELE),
Repsol (REP) and Zeltia (ZEL), which belong to different economic sectors, in order to seek
universality. As we focus on human activity, we filter out the automatic operations, still
retaining 84.5% of the TEF data (worst case) and 99.2% of the SGC data (best case). Table I
summarizes the data subset.

It is known that human activity is bursty and non-homogeneous in time, and several descriptions
have been proposed. Figure 1A displays the complementary cumulative distribution function (CCDF)
of individuals' activity, showing a robust power law with an exponent very close to 1 (Zipf's
law), both for the aggregated data and for each stock separately. The exponent coincides with the
one found in several contexts [12-14] and particularly for Nokia expert traders. It can however be
argued that the heterogeneous activity profile is simply due to the fact that the time period
between the first and the last operation differs among investors, and thus we cannot infer a
different decision-making profile for each individual. The inset of Fig. 1B partially supports
this argument, since there is a linear relation between the number of operations and the number of
trading days, even though data points are widely scattered. Therefore, the number of operations
has been normalised, and the distribution of the number of operations per trading day (OpD) is
represented in Fig. 1B. From the main panel of Fig. 1B we conclude that heterogeneity is still
preserved, with a tail index equal to 1.29. Those individuals who operate very infrequently, less
than 1 operation per day, represent around 75% of the population, while there is
Ticker   Individuals   Operations   Trading Volume

TEF      11571         273,150      91.62 × 10^
SAN      9450          159,798      100.24 × 10^
BBVA     8549          128,006      38.77 × 10®
ELE      6226          89,452       13.87 × 10®
REP      5655          81,045       12.06 × 10®
EZE      2696          60,421       2.37 × 10®
SGC      4182          57,816       1.56 × 10®
ZEL      3694          52,601       1.21 × 10®

TABLE I. Total number of clients, number of operations and trading volume (in euros) for the
studied assets.



an investor with around 30 operations per trading day 
on average. 
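
The quantities used in this paragraph (activity CCDF, tail index, operations per trading day) can
be illustrated with a minimal Python sketch. It assumes that the tail index is obtained with a
standard Hill estimator, as suggested by the caption of Fig. 1; the function names, the number k
of upper order statistics and the synthetic sample are hypothetical choices made for illustration
and are not taken from the original analysis.

```python
import math


def ccdf(values):
    """Empirical CCDF as a list of (x, P[X >= x]) pairs, one per distinct x."""
    ordered = sorted(values)
    n = len(ordered)
    points = []
    for idx, x in enumerate(ordered):
        # Only the first occurrence of each distinct value contributes a point.
        if idx == 0 or x != ordered[idx - 1]:
            points.append((x, (n - idx) / n))
    return points


def hill_tail_index(values, k):
    """Hill estimate of the tail index using the k largest observations."""
    ordered = sorted(values, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < len(values)")
    threshold = ordered[k]  # the (k+1)-th largest value
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


def operations_per_trading_day(n_operations, n_trading_days):
    """OpD: number of operations normalised by the number of trading days."""
    return n_operations / n_trading_days


if __name__ == "__main__":
    # Synthetic Pareto-like sample with tail index 1 (illustrative data only).
    n = 2000
    sample = [n / (n - i - 0.5) for i in range(n)]
    print(hill_tail_index(sample, k=200))
    print(ccdf([1, 1, 2, 4]))
```

The Hill estimate depends on the choice of k, so in practice one would inspect its stability over
a range of k values before quoting a single exponent.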

We define the activity $O^i(t)$ of investor $i$ as the number of operations performed at day $t$.
We then understand $T^i$ as the time period starting the day that $i$ did the first operation and
ending when $i$ did the last operation. We note that the active period $T^i$ may also include
non-trading days, so the total number of trading days $N^i$ of investor $i$ satisfies the
inequality $N^i \leq T^i$. As shown in Fig. 2A, we compute the cross-correlation between investors
$i$ and $j$ with simultaneous activity during a period $T^{ij} > 0$. That is:

$$\rho^{ij} = \frac{1}{T^{ij}} \sum_{t=T_0^{ij}}^{T_f^{ij}} \frac{\left(O^i(t) - \bar{O}^i\right)\left(O^j(t) - \bar{O}^j\right)}{\sigma_{O^i}\,\sigma_{O^j}}, \qquad (1)$$

where $T_0^{ij}$ and $T_f^{ij}$ are respectively the first day and the last day of $i$ and $j$
simultaneous activity. The bar represents the averages and the $\sigma$'s denote the standard
deviations of the $i$ and $j$ investors' activity over $T^{ij}$. Identical notation will be used
in forthcoming equations. In order to work with statistical robustness, we restrict ourselves to
those individuals with at least 20 operations and focus on the most synchronized ones. To keep the
most synchronized ones, we shuffle the values of $O^i(t)$ and $O^j(t)$ belonging to the time
period $T^{ij}$ and calculate the cross-correlation again. If the original $\rho^{ij}$ is below
the threshold corresponding to the 0.01 p-value for all shuffled terms, we set $\rho^{ij}$ to zero
manually. This double filtering still maintains most of the operations and volume traded, as shown
in Table II.
Figure 2B shows the Repsol synchronisation network as an illustration. The nodes (individuals)
without connections are removed in the interest of clarity, and weighted edges correspond to the
filtered coefficients $\rho^{ij}$. The rest of the stocks are reported in Table II. We also report
the modularity of each stock, which gives an idea of the network structure. The values are similar
to those of other social and biological networks. These magnitudes tell us that agents' activity
is far from random and unveils some community structure.
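
The shuffling filter described above can be sketched as follows. This is only a minimal
illustration under stated assumptions: the two activity series are already restricted to the
common period, both series are shuffled independently, the threshold is the one-sided empirical
quantile of the shuffled correlations, and the number of shuffles (1000) and the helper names are
hypothetical, since the text does not specify them.

```python
import random


def pearson(xs, ys):
    """Pearson correlation with population standard deviations (0.0 if undefined)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n * sx * sy)


def filtered_correlation(act_i, act_j, n_shuffles=1000, p_value=0.01, seed=0):
    """Return the cross-correlation, or 0.0 if it does not exceed the shuffled threshold."""
    rng = random.Random(seed)
    rho = pearson(act_i, act_j)
    a, b = list(act_i), list(act_j)
    shuffled = []
    for _ in range(n_shuffles):
        # Shuffling destroys the temporal alignment between the two series.
        rng.shuffle(a)
        rng.shuffle(b)
        shuffled.append(pearson(a, b))
    shuffled.sort()
    # Empirical (1 - p_value) quantile of the shuffled correlations.
    index = min(n_shuffles - 1, int((1 - p_value) * n_shuffles))
    threshold = shuffled[index]
    return rho if rho > threshold else 0.0


if __name__ == "__main__":
    daily_ops = [0, 3, 0, 0, 5, 0, 2, 0, 0, 4] * 3  # illustrative activity profile
    print(filtered_correlation(daily_ops, daily_ops))
```

The surviving nonzero coefficients would then serve as edge weights of the synchronization
network, on which a community detection method such as Louvain can be run.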

Interesting questions arising here are which relevant 
factors lead individuals to make decisions and if agents 
FIG. 1. Activity properties. A) Activity CCDF of individuals in the 8 assets (top). The Hill tail
indexes together with the index obtained from data aggregation, 1.07±0.03 (bottom). B) Activity
CCDF of individuals for the aggregate of the 8 assets as a function of the number of operations
per trading day. The inset represents the number of operations versus trading days jointly with a
dotted line of unit slope.



are influenced by macroscopic information. Recent research has correlated Google searches with
trading volume, but here we aim to identify an endogenous variable at an individual level. In this
sense, the most relevant macroscopic variable in terms of the market activity is volatility rather
than price. We work with the High-Low volatility $\nu(t)$, defined as the difference between the
highest and the lowest price value divided by the open price of day $t$. The easiest way to see
how volatility influences our investors' activity is by focussing on mesoscopic activity variables
from the same day: $O(t)$, that is, the total number of operations made by the studied
individuals. The dependences between the $O(t)$ and $\nu(t)$ time series are studied by computing
linear cross-correlations
FIG. 2. A) Activity profile of two investors ($i$ and $j$). $T^{ij}$ is the intersection between
the active periods $T^i$ and $T^j$. B) Synchronization network for REP. The size of the nodes is
proportional to OpD. The node's colour represents $\rho^i_{O\nu}$ from -1 (red) to +1 (blue) and
the edge's thickness the weight $\rho^{ij}$.



(Long). Since volatility is a long-memory process [17], the mean in the correlation formula is
subtracted to avoid any bias. However, most of the studied investors have a short-term horizon.
To account for this, we also compute the correlation by subtracting the local mean given by a
5-day moving average (Short) [17]. Table II shows significant correlation in both measures. High
volatility thus clearly affects clients' activity in both long and short time horizons.
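
A minimal sketch of the two mesoscopic measures may help fix ideas. It assumes daily lists of
high, low and open prices and of the total number of operations; for the Short measure it assumes
a trailing 5-day window applied to both series, which is one possible reading of the local-mean
subtraction. All function names are hypothetical.

```python
def high_low_volatility(high, low, open_):
    """High-Low volatility: (highest - lowest price) / open price, day by day."""
    return [(h - l) / o for h, l, o in zip(high, low, open_)]


def subtract_moving_average(series, window=5):
    """Subtract a trailing moving average; the first window - 1 days are dropped."""
    detrended = []
    for t in range(window - 1, len(series)):
        local_mean = sum(series[t - window + 1 : t + 1]) / window
        detrended.append(series[t] - local_mean)
    return detrended


def pearson(xs, ys):
    """Pearson correlation with population standard deviations (0.0 if undefined)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n * sx * sy)


def long_short_correlation(activity, volatility, window=5):
    """Return (Long, Short): global-mean and local-mean subtracted correlations."""
    long_rho = pearson(activity, volatility)
    short_rho = pearson(
        subtract_moving_average(activity, window),
        subtract_moving_average(volatility, window),
    )
    return long_rho, short_rho


if __name__ == "__main__":
    vol = high_low_volatility([11.0, 12.0], [9.0, 9.0], [10.0, 10.0])
    print(vol)
```

Other conventions (a centred window, or detrending only one of the two series) would be equally
compatible with the description and could change the Short values.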

After analysing the collective response to volatility, the individual response needs to be
considered. If investor $i$ is sensitive to risk fluctuations, the activity $O^i(t)$ will not be
time-homogeneous. We thus compute the correlation between volatility and number of operations

$$\rho^i_{O\nu} = \frac{1}{N^i} \sum_{t=T_0^i}^{T_f^i} \frac{\left(O^i(t) - \bar{O}^i\right)\left(\nu(t) - \bar{\nu}\right)}{\sigma_{O^i}\,\sigma_{\nu}}. \qquad (2)$$

The correlation only takes into account trading days ($O^i(t) > 0$). Once more, investors with
fewer than 20


Ticker   Investors     Operations          Mod.    $\rho_{O\nu}$ Long   $\rho_{O\nu}$ Short

TEF      1240 (10%)    204 146 (74.74%)    0.411   0.5560               0.5259
SAN      701 (7%)      114 537 (71.68%)    0.385   0.2795               0.5259
BBVA     385 (4%)      90 719 (70.87%)     0.468   0.0858               0.4175
ELE      257 (4%)      64 107 (71.67%)     0.463   0.2399               0.3983
REP      252 (4%)      53 542 (66.06%)     0.531   0.0972               0.1504
EZE      251 (9%)      43 717 (72.35%)     0.265   0.2139               0.3507
SGC      135 (3%)      33 959 (58.74%)     0.473   0.4789               0.4684
ZEL      251 (7%)      30 611 (58.19%)     0.423   0.4228               0.4893

TABLE II. Number of investors with their number of operations in the activity synchronization
network after removing inactive and non-synchronized investors from the original data (remaining
percentage in brackets). Modularity calculated by the Louvain method (Mod.) is provided. The last
columns show the correlation at mesoscopic level between $O(t)$ and $\nu(t)$.




FIG. 3. Distribution of investors according to their $\rho^i_{O\nu}$ for the aggregate of the
eight assets. The inset shows a scatter plot of all investors' cross-correlations with original
and shuffled data.



trading days are excluded. Figure 3 shows that the population distributed according to its
individual response to volatility is neither neutral nor symmetric. In all the studied stocks, the
mean and the most probable value are systematically shifted to the positive side, explaining the
mesoscopic polarization reported in Table II. The variance of the distribution is 1.8 times the
variance of the shuffled case, which decorrelates the volatility and activity time series.
Therefore the population of agents, in terms of $\rho^i_{O\nu}$, is sensitive to the studied
macroscopic signal. The average shows an alignment to volatility and the variance increases,
demonstrating a rather diverse response to volatility. Moreover, the most active daily investors
display a special sensitivity to daily volatility, as can be seen in the inset of Fig. 3.
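
The individual response and its shuffled benchmark can be sketched along the following lines. The
code assumes that each investor's activity is given as a daily list aligned with the volatility
series, keeps only trading days (nonzero activity), and discards investors with fewer than 20 of
them. The shuffling scheme (a random permutation of the volatility values) and all names are
assumptions made for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation with population standard deviations (0.0 if undefined)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n * sx * sy)


def individual_response(activity, volatility, min_trading_days=20):
    """Activity-volatility correlation over trading days only, or None if too few."""
    pairs = [(o, v) for o, v in zip(activity, volatility) if o > 0]
    if len(pairs) < min_trading_days:
        return None
    ops, vols = zip(*pairs)
    return pearson(ops, vols)


def shuffled_response(activity, volatility, seed=0, min_trading_days=20):
    """Same measure after permuting volatility, which decorrelates the two series."""
    rng = random.Random(seed)
    permuted = list(volatility)
    rng.shuffle(permuted)
    return individual_response(activity, permuted, min_trading_days)


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def variance_ratio(responses, shuffled_responses):
    """Ratio between the variances of the original and shuffled response distributions."""
    return variance(responses) / variance(shuffled_responses)


if __name__ == "__main__":
    # Illustrative investor: inactive every third day, otherwise tracking volatility.
    activity = [(t % 5) + 1 if t % 3 else 0 for t in range(40)]
    volatility = [0.01 * (1 + t % 5) for t in range(40)]
    print(individual_response(activity, volatility))
    print(shuffled_response(activity, volatility))
```

Collecting both quantities over all investors gives the two distributions whose means and
variances are compared in the text.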

We can also guess that the activity profiles of two agents with a similar response to volatility
should not be very different. The node colour in the synchronization network in Fig. 2B gives an
idea of the cross-correlation
FIG. 4. Assortativity for all networks. Rewiring (green) and shuffling (blue) benchmark
assortativities are shown. The error bars provide the 95% confidence interval (CI).



for each individual and the previous statement can intuitively be confirmed. We observe there that
investors with a similar response to volatility tend to be connected. We can go further with this
idea in a more rigorous way by measuring the connectivity structure of the network through the
assortativity by attributes according to the $\rho^i_{O\nu}$ value [11]. To do so, we first
discretize all $\rho^i_{O\nu}$ by keeping the integer part of $\rho^i_{O\nu} \times 100$. The
final score for the assortativity is calculated as



$$r = \frac{\sum_{xy} xy \left(e_{xy} - a_x b_y\right)}{\sigma_a \sigma_b}, \qquad (3)$$



where $e_{xy}$ is the fraction of the links that join together nodes with values $x$ and $y$ for
the discretized $\rho^i_{O\nu}$. The variables $a_x = \sum_y e_{xy}$ and $b_y = \sum_x e_{xy}$ are
respectively the fraction of edges that start and end at nodes with values $x$ and $y$, and the
condition $\sum_{xy} e_{xy} = 1$ needs to be fulfilled. The assortativity values in all 8 stocks
lie between 0.08 and 0.20, as shown in Figure 4. Although these values are not very high, they are
statistically significant since they lie clearly above the confidence intervals of two random
benchmarks. First, the randomized network rewires all the edges while preserving the nodes'
properties. Second, one preserves the network's topology but shuffles the nodes' attributes
$\rho^i_{O\nu}$. In both cases, we keep the population distribution and the assortativity falls
to 0. Similarly, we perform the same analysis but taking into account the OpD property instead.
In this case, investors mostly synchronize in typical speculative assets (TEF, SAN, BBVA and REP).
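
As an illustration of the assortativity measurement and of the attribute-shuffling benchmark, the
sketch below assumes the scalar form of Newman's assortativity coefficient on an undirected,
unweighted edge list, with node attributes discretized as the integer part of the correlation
times 100. The function names, the number of benchmark samples and the toy network are
hypothetical; edge weights and the rewiring benchmark are not covered.

```python
import random
from collections import defaultdict


def discretize(rho):
    """Integer part of rho x 100 (int() truncates toward zero)."""
    return int(rho * 100)


def assortativity(edges, attribute):
    """Scalar assortativity of an undirected network for integer node attributes."""
    e = defaultdict(float)
    ends = 2 * len(edges)  # each undirected edge is counted in both directions
    for u, v in edges:
        x, y = attribute[u], attribute[v]
        e[(x, y)] += 1 / ends
        e[(y, x)] += 1 / ends
    a = defaultdict(float)  # a_x = sum_y e_xy (equal to b_x for undirected edges)
    for (x, _), weight in e.items():
        a[x] += weight
    mean = sum(x * w for x, w in a.items())
    var = sum(x * x * w for x, w in a.items()) - mean ** 2
    if var <= 1e-12:
        return 0.0  # all edge ends share one attribute value: coefficient undefined
    cov = sum(x * y * w for (x, y), w in e.items()) - mean ** 2
    return cov / var


def shuffled_benchmark(edges, attribute, n_samples=200, seed=0):
    """Assortativity after shuffling attributes among nodes, keeping the topology."""
    rng = random.Random(seed)
    nodes = list(attribute)
    values = [attribute[node] for node in nodes]
    scores = []
    for _ in range(n_samples):
        rng.shuffle(values)
        scores.append(assortativity(edges, dict(zip(nodes, values))))
    return scores


if __name__ == "__main__":
    attrs = {0: 10, 1: 10, 2: -20, 3: -20}  # toy discretized responses
    print(assortativity([(0, 1), (2, 3)], attrs))
    print(assortativity([(0, 2), (1, 3)], attrs))
```

A confidence interval for the benchmark can then be read from the empirical distribution of the
shuffled scores and compared with the value measured on the original attributes.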

This paper highlights and quantifies the influence of macro-variables on individuals' activity at
a micro-level. The empirical work deals with rather unique records from a large set of non-expert
investors (clients of a firm) from the Spanish stock market, which allow us to make statistically
robust statements. The fact that the analyzed individuals are non-professional makes it possible
to extrapolate the results obtained to other contexts, with the purpose of better understanding
human activity sensitive to a common macroscopic signal. We have first observed that the activity
is strongly heterogeneous among the individuals, with a power-law exponent close to 1. The
influence of a macroscopic signal, the volatility, has been revealed both at a mesoscopic level,
the whole community of investors, and at a microscopic level, each individual. The general
tendency of individuals to show positive $\rho^i_{O\nu}$ explains the positive correlation at
mesoscopic levels. The analysis of the assortativity by $\rho^i_{O\nu}$ over the activity
synchronisation network has finally shown that the synchronisation among individuals takes place
because investors are influenced in the same way by the same macroscopic field, thus providing an
alternative explanation for herding behaviour. The results obtained can also be very useful to
test and calibrate, or even to improve, existing agent-based models. Most importantly, they
enhance the debate on rationality and decision-making in socioeconomic contexts from a
physicist's perspective.